In an era where data drives decisions, the ability to process and interpret complex
information efficiently has become paramount, especially in the healthcare
sector. InfraNodus, a text network analysis tool developed by Nodus Labs, can be
considered at the forefront of this data-driven revolution. It employs network
analysis algorithms and artificial intelligence (AI) to transform textual data into
visual representations that help users uncover hidden patterns and connections.
This editorial explores the capabilities of InfraNodus and its potential to support
topic modeling in healthcare administration.

A text network analysis was employed in this study using InfraNodus (Nodus 
Labs, Leeds, UK, https://infranodus.com), an open-source tool for analyzing 
and interpreting fragmented textual data. The analysis processed a text corpus 
containing all provided definitions (Paranyushkin, 2019). InfraNodus transforms 
the text corpus into a network graph revealing prominent topics, their 
connections, and structural gaps. This method enables the analysis of discourse 
structure and diversity by scrutinizing the community layout within the graph. 
Using an algorithm, the tool converts text into a network format, emphasizing
important words based on how frequently they appear together. Subsequently,
network analysis techniques identify clusters of closely associated concepts
(a form of topic modeling) and determine the most
influential nodes (top keywords) (Paranyushkin, 2019). It is worth mentioning
that InfraNodus is a relatively new tool, and some scholars might not be familiar
with it. However, several researchers have already published studies utilizing
this tool (Dicks et al., 2021; Feyissa & Zhang, 2023; Güneri & Taddei, 2023).


Steps in Topic Modeling
The topic modeling procedure consisted of the following steps:

Firstly, abstracts were searched in the most extensive databases: PubMed and 
Scopus. The first search was conducted on 2 October 2022 and then updated on 
30 May 2024 on PubMed with keywords “healthcare administration” and 
“artificial intelligence” and on Scopus with keywords “healthcare,”
“administration,” “artificial,” and “intelligence.” Filtering was done with the
following criteria: English research articles published between 2022 and 2024. As 
a result, 1413 abstracts were collected (1199 in PubMed and 214 in Scopus). 

Secondly, text cleaning and pre-processing were done using Python 3.11. 
Initially, the text was read from a .txt file and converted to lowercase to ensure 
uniformity. Numerical digits and non-word characters were then removed to 
eliminate noise. Tokenization was performed to split the text into individual 
words (tokens). Subsequently, common English stopwords, which do not 
contribute significant meaning, were removed from the token list. Each token 
was then lemmatized, reducing words to their base or root form to address 
variations in tense and form. This process helps normalize the text data, making 
it more consistent for analysis. After these transformations, the tokens were 
joined back into a single string, forming the cleaned text. Finally, the cleaned text 
was saved to a .txt file for subsequent analysis. These steps ensured that the text 
data was free of irrelevant elements, standardized, and ready for further natural 
language processing tasks, enhancing the quality and accuracy of the analysis. 
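
The cleaning steps described above can be sketched in Python as follows. This is a minimal illustration rather than the study's actual script: the source does not name the stopword list or lemmatizer that was used, so the `STOPWORDS` set and the `LEMMA_LOOKUP` table below are small hypothetical stand-ins, and reading from and writing to the .txt files is omitted.

```python
import re

# Assumption: a tiny illustrative stopword set; the study's actual list is not specified.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "of", "on",
             "the", "to", "was", "were", "with"}

# Assumption: a toy lookup table standing in for a real lemmatizer.
LEMMA_LOOKUP = {"models": "model", "patients": "patient", "hospitals": "hospital"}


def lemmatize(token):
    """Return the base form of a token, or the token itself if it is unknown."""
    return LEMMA_LOOKUP.get(token, token)


def clean_text(raw):
    """Lowercase, strip digits and non-word characters, drop stopwords, lemmatize."""
    text = raw.lower()
    text = re.sub(r"\d+", " ", text)       # remove numerical digits
    text = re.sub(r"[^\w\s]", " ", text)   # remove non-word characters
    tokens = text.split()                  # simple whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]
    tokens = [lemmatize(t) for t in tokens]
    return " ".join(tokens)                # rejoin into a single cleaned string


print(clean_text("The 3 models were trained on 1,200 patients."))
```

In practice, the lookup table would be replaced by an established lemmatizer and a full stopword list; the order of the steps is what the sketch is meant to show.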

Thirdly, the cleaned text was imported into InfraNodus, transforming the content into
an interactive network graph that revealed key patterns and connections. 
InfraNodus created a network graph where lemmas served as nodes, and words 
were grouped into 4-word windows (4-grams), with edges formed based on 
word co-occurrence within these windows and assigned varying weights 
(Paranyushkin, 2011, 2019). The network graph was visualized using different 
colors and sizes to represent topics, with the Force Atlas algorithm (Jacomy et al., 
2014) enhancing clarity. 
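
To make the windowed co-occurrence idea concrete, the following Python sketch builds weighted edges from a token list using a 4-word sliding window. It is an illustration under stated assumptions, not InfraNodus's implementation: the function name `cooccurrence_edges` is hypothetical, and the weighting (adjacent words weigh more than words further apart within the window) is assumed here only to show what "varying weights" could look like.

```python
from collections import Counter


def cooccurrence_edges(tokens, window=4):
    """Count weighted co-occurrences of tokens within a sliding window.

    Returns a Counter mapping an alphabetically sorted (word, word) pair
    to its accumulated edge weight.
    """
    edges = Counter()
    for i, source in enumerate(tokens):
        for offset in range(1, window):
            j = i + offset
            if j >= len(tokens):
                break
            target = tokens[j]
            if source == target:
                continue  # skip self-loops
            pair = tuple(sorted((source, target)))
            # Assumed weighting: the closer two words are, the heavier the edge.
            edges[pair] += window - offset
    return edges


tokens = ["model", "risk", "patient", "care", "model"]
for pair, weight in sorted(cooccurrence_edges(tokens).items()):
    print(pair, weight)
```

Each distinct lemma becomes a node, and the accumulated weights can then be passed to a graph library for layout and community detection.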

Fourthly, InfraNodus automatically filtered unnecessary elements such as
articles and auxiliary verbs. Additionally, unrelated words and verbs (e.g.,
“identify,” “evaluate,” “improve,” and “address”) were removed, as the analysis
focused exclusively on nouns. This process, known as “node filtering,” extracts
semantic morphemes (the fundamental meaning units of language) from
unstructured text. It involves identifying the words that contain these
morphemes, with nouns frequently selected as morphemes in network analysis
(Kang et al., 2021). In InfraNodus, node filtering can be achieved using features
like “hidden nodes.”

Fifthly, the researcher engaged in iterative observation of nodes and 
statements, verifying key terms to ensure their relevance. For example, if a 
prominent node like “declaration” was derived from a funding declaration in an 
article but was unrelated to the study’s topic, it was removed. Similarly, 
frequently recurring concepts/keywords such as “artificial intelligence, AI, 
healthcare administration” were hidden since they primarily appeared in study 
titles. This aimed to uncover the underlying ideas within the statements. 
InfraNodus applied the Jenks elbow cutoff algorithm, a simplified version of
the K-means algorithm, to the centrality scores (Paranyushkin, 2011, 2019). This
method identified and ranked the most influential nodes, that is, those whose
impact clearly surpassed that of the others. Influential nodes were identified
either by high betweenness centrality, which indicates that they frequently lie on
the shortest paths between randomly chosen nodes and therefore connect different
communities, or by having the highest degree (Paranyushkin, 2011, 2019). The
researcher opted not to let InfraNodus analyze the data autonomously, as it 
includes all keywords, verbs, and potentially unnecessary words, which could 
influence word clusters. Consequently, the researcher played a crucial role in 
repeatedly refining the keywords and examining the correlations between the 
text corpus and concepts. 
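
Betweenness centrality, as described above, can be illustrated with a short Python sketch. The code below applies Brandes' algorithm to a small unweighted, undirected graph; it is an illustrative assumption about how such scores can be computed, not InfraNodus's own code, and it ignores edge weights for simplicity. The example graph and its node names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency: dict mapping each node to the set of its neighbours.
    Returns scores normalised to the range [0, 1].
    """
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        stack = []
        predecessors = {node: [] for node in adjacency}
        path_count = dict.fromkeys(adjacency, 0)
        path_count[source] = 1
        distance = dict.fromkeys(adjacency, -1)
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate dependencies in order of decreasing distance.
        dependency = dict.fromkeys(adjacency, 0.0)
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    n = len(adjacency)
    # Each undirected pair is visited twice, so dividing by (n - 1)(n - 2) normalises.
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {node: value * scale for node, value in scores.items()}


# Hypothetical graph: two triangles joined by the edge "model"-"patient".
graph = {
    "risk": {"model", "accuracy"},
    "accuracy": {"model", "risk"},
    "model": {"risk", "accuracy", "patient"},
    "patient": {"model", "care", "quality"},
    "care": {"patient", "quality"},
    "quality": {"patient", "care"},
}
for node, score in sorted(betweenness_centrality(graph).items()):
    print(node, round(score, 3))
```

In this toy graph the two bridging nodes receive the highest scores because every shortest path between the two triangles passes through them, which mirrors the role of high-betweenness keywords in connecting topical communities.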

Sixthly, InfraNodus allowed the researchers to evaluate the topical diversity 
score, determining whether the text was biased towards a few central concepts 
or had multiple topical clusters. This score is categorized into four states: Biased 
(Low), Focused (Medium), Diverse (Optimal), and Dispersed (Very High) 
(Paranyushkin, 2011, 2019). The metric considers the modularity measure, with 
thresholds set at >0.4 for medium modularity and >0.65 for high modularity, 
using the Louvain community detection algorithm developed by Blondel et al. 
(2008). It also accounts for the influence distribution measure, which assesses the 
entropy of the distribution of top nodes among the top clusters, the percentage 
of nodes in the top topic, and the relative influence of the top two topical clusters. 
If the text was biased, the researcher repeatedly reviewed the nodes and 
statements to achieve a medium or optimal level of diversity. Once clarity was 
achieved, they proceeded to the next step. 
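
The modularity measure referred to above can be computed for a given partition as sketched below. This is a minimal illustration: the communities are assigned by hand rather than detected with the Louvain algorithm, the function names are hypothetical, and `modularity_level` applies only the two thresholds quoted in the text (the "low" label for values at or below 0.4 is an assumption). The full topical diversity score also uses the influence distribution measures, which are not reproduced here.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity of a partition of an undirected weighted graph.

    edges: dict mapping (u, v) pairs to edge weights (each edge listed once).
    community_of: dict mapping each node to its community label.
    """
    total_weight = sum(edges.values())
    internal = defaultdict(float)    # weight of edges inside each community
    degree_sum = defaultdict(float)  # summed weighted degree per community
    for (u, v), weight in edges.items():
        degree_sum[community_of[u]] += weight
        degree_sum[community_of[v]] += weight
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += weight
    return sum(
        internal[c] / total_weight - (degree_sum[c] / (2 * total_weight)) ** 2
        for c in degree_sum
    )


def modularity_level(q):
    """Map a modularity value to a level using the thresholds quoted in the text."""
    if q > 0.65:
        return "high"
    if q > 0.4:
        return "medium"
    return "low"  # assumption: anything at or below 0.4 is treated as low


# Hypothetical graph: two triangles joined by a single edge.
edges = {
    ("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
    ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
    ("c", "d"): 1,
}
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(edges, partition)
print(round(q, 3), modularity_level(q))
```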

Seventhly, the researcher had the option to use GPT-4 to synthesize ideas by 
clustering related concepts. GPT-4 was employed to generate names for these 
topical clusters and to explore critical relationships among the concepts 
(Paranyushkin, 2011, 2019). However, it was not feasible to rely solely on GPT-4;
the researcher needed to examine and verify the concepts and their associated 
statements thoroughly. 

Lastly, InfraNodus allowed the researchers to perform sentiment analysis of
the statements using the BERT AI model (HuggingFace, 2023), enabling them to
determine whether the sentiments expressed towards the topic were positive,
negative, or neutral.


Results 


Table 1 Top 10 keywords 


Keyword        Betweenness Centrality    Next Jenks Elbow Point
model          0.45311554748941300       0.45311554748941300
patient        0.4458560193587420        0.18935269207501500
clinical       0.2679975801572900        0.18935269207501500
learning       0.18935269207501500       0.18935269207501500
research       0.1385359951603150        0.03508771929824560
data           0.12704174228675100       0.03508771929824560
information    0.09981851179673320       0.03508771929824560
care           0.09013914095583790       0.03508771929824560
risk           0.03508771929824560       0.03508771929824560


Table 1 shows the centrality of keywords derived from the text network 
analysis using InfraNodus. The “Betweenness Centrality” metric quantifies the 
significance of each keyword in facilitating relationships between other 
keywords within the network. Notably, the keyword “model” attains the highest 
centrality (0.453), indicating its pivotal role in connecting various concepts. While 
this centrality metric may also reflect the attractiveness of certain keywords to 
authors or publishers due to their significance in scholarly discourse, it primarily 
emphasizes their importance in facilitating communication within the network. 
“Patient” (0.445), “clinical” (0.268), and “learning” (0.189) follow. The
“Next Jenks Elbow Point” marks the thresholds at which centrality drops
sharply; in Table 1 these cutoffs fall at 0.453, 0.189, and 0.035. These values
guide the identification of the prominent keywords that shape the interconnected
themes and topics. Keywords such as “research,” “data,” “information,” “care,” and
“risk” also contribute, forming the clusters that support the main
themes.

Table 2 and Figure 1 show the thematic clusters in the reviewed abstracts
related to artificial intelligence in healthcare administration. The Healthcare
Quality cluster (33% influence) is the most influential, followed by the Risk
Assessment cluster (26% influence), Clinical Support cluster (25% influence),
Machine Learning cluster (9% influence), and Healthcare Data cluster (6%
influence). Each cluster is characterized by specific keywords, providing insights
into prevalent themes within the literature.


Table 2 Topical clusters 


Topical Cluster | Cluster Code | Influence | Total Nodes | % of Entries | Category | Keywords
1 | 0 | 33% | 19 | 25% | Healthcare Quality | Patient, data, care, quality, medicine, hospital, diagnosis, disease, public, informatics, surgery, treatment, image
2 | 4 | 26% | 12 | 8% | Risk Assessment | Model, risk, performance, accuracy, dataset, training, language, assessment, test, diagnostic, prediction, human
3 | 2 | 25% | 16 | 11% | Clinical Support | Clinical, research, report, decision, development, support, work, practice, drug, case, service, process, implementation, evaluation, application, tool
4 | 3 | 9% | 8 | 4% | Machine Learning | Learning, deep, technique, approach, algorithm, method, machine, network
5 | 1 | 6% | 4 | 2% | Healthcare Data | Information, computer, digital, engineering


Notes: Topical diversity: Focused (Medium) | Modularity: 0.36 
Total statements analyzed: 3339 [positive: 66% | negative: 27% | neutral: 7%] (Based on Bert AI model) 


Table 3 Top relations and blind spot 


Top relations / N-grams:
1) Machine learning
2) Deep learning
3) Learning model
4) Prediction model
5) Patient care

Blind spot with topical connectors: Computer, decision, report, care, quality, research, performance, data


The topical diversity rating, “Focused,” indicates that the range
of topics within the dataset is medium. This suggests a cohesive
concentration on specific themes relevant to healthcare and clinical research, 
promoting in-depth exploration rather than broad coverage. On the other hand, 
“Modularity: 0.38” suggests a structured organization of the clusters, indicating 
distinct thematic groupings within the dataset. This modular structure enhances 
clarity in the relationships between topics and facilitates targeted analysis of 
interconnected themes, potentially revealing nuanced insights into healthcare 
quality, risk assessment, clinical support, machine learning applications, and 
healthcare data management. 

According to the BERT AI model, which classified each statement as positive,
negative, or neutral, sentiment is predominantly positive, accounting for 66% of
the analyzed statements. This suggests a general optimism or favorable
perception toward artificial intelligence in healthcare. However, a notable
proportion of the analyzed statements, 27%, carry negative sentiment. This
indicates areas of concern or skepticism regarding the implementation or impact
of AI in healthcare, potentially highlighting challenges or ethical considerations.
The remaining 7% of statements are neutral, neither strongly favoring nor
opposing AI in healthcare, suggesting areas where further investigation or
nuanced understanding may be beneficial.

Table 3 identifies the top relationships and blind spots within the text network. 
The top N-grams (machine learning, deep learning, learning model, prediction
model, and patient care) indicate key areas of focus in the text. The blind spots,
including keywords such as computer, decision, report, care, quality, research, 
performance, and data, suggest areas where additional exploration could 
enhance the understanding of interconnected themes and topics. 

Figure 1 Topic clusters using InfraNodus


Summary of the Findings 
The identified topical clusters provide evidence of a robust research landscape
where AI is increasingly integrated into various facets of healthcare
administration.

The Healthcare Quality cluster’s dominance emphasizes efforts to enhance
patient care and operational efficiency through data-driven approaches. For 
instance, studies within this cluster may explore how AI algorithms improve 
diagnostic accuracy or optimize resource allocation in healthcare facilities. 
Similarly, the Risk Assessment cluster reflects the healthcare sector’s focus on 
using AI to predict and manage risks associated with patient outcomes and 
operational challenges. This evidence suggests a proactive approach to 
leveraging AI for preventive healthcare strategies and efficient resource 
utilization. 

The Clinical Support cluster’s prominence highlights AI’s role in augmenting 
clinical decision-making processes, supporting evidence-based practices, and 
streamlining healthcare delivery. Moreover, the Machine Learning cluster’s 
focus on advanced algorithms highlights ongoing innovations in AI techniques 
tailored for healthcare data analysis. Evidence-based AI models are increasingly 
used to extract meaningful insights from large datasets, improving disease 
detection, treatment efficacy, and healthcare management strategies. 

Lastly, the Healthcare Data cluster underlines the critical role of AI in 
managing and utilizing healthcare data to drive informed decision-making and 
policy formulation. This includes efforts to ensure data quality, security, and 
interoperability across healthcare systems. 

Overall, the analysis of these topical clusters demonstrates how AI is 
reshaping healthcare administration by addressing key challenges, improving 
patient outcomes, and optimizing healthcare delivery. The evidence supports 
continued exploration and integration of AI technologies to meet evolving 
healthcare needs effectively. 


Conclusion 

This editorial underlines the transformative potential of InfraNodus in
analyzing abstracts focused on artificial intelligence in healthcare administration.
The analysis identifies five structured topical clusters (Healthcare Quality, Risk
Assessment, Clinical Support, Machine Learning, and Healthcare Data) that
highlight critical research trends and priorities. These findings point to the
growing integration of AI technologies to enhance healthcare quality, support
clinical decision-making, manage risks, and optimize data utilization. The
predominantly positive sentiment toward AI reflects optimism about its potential
benefits, while the negative share signals ethical considerations and
challenges. Utilizing AI insights will be crucial for advancing
healthcare administration practices and effectively addressing emerging
healthcare challenges.